Redundant data storage in multi-die memory systems

ABSTRACT

A method for data storage includes, in a memory that includes at least N memory units, each memory unit including memory blocks, defining superblocks, each superblock including a respective set of N of the memory blocks that are allocated respectively in N different ones of the memory units, such that compaction of all the memory blocks in a given superblock is performed without any intervening programming operation in the given superblock. Data is stored in the memory by computing redundancy information for a selected portion of the data, and storing the selected portion and the redundancy information in the N memory blocks of a selected superblock.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Patent Application 61/293,808, filed Jan. 11, 2010, U.S. Provisional Patent Application 61/364,406, filed Jul. 15, 2010, and U.S. Provisional Patent Application 61/373,883, filed Aug. 16, 2010, whose disclosures are incorporated herein by reference. This application is related to a U.S. patent application entitled “Redundant data storage schemes for multi-die memory systems” Ser. No. 12/987,175, filed on even date, whose disclosure is incorporated herein by reference.

FIELD OF THE INVENTION

The present invention relates generally to memory devices, and particularly to methods and systems for redundant data storage in memory systems.

BACKGROUND OF THE INVENTION

Several types of memory devices, such as Flash memories, use arrays of analog memory cells for storing data. Each analog memory cell stores a quantity of an analog value, also referred to as a storage value, such as an electrical charge or voltage. This analog value represents the information stored in the cell. In Flash memories, for example, each analog memory cell holds a certain amount of electrical charge. The range of possible analog values is typically divided into intervals, each interval corresponding to one or more data bit values. Data is written to an analog memory cell by writing a nominal analog value that corresponds to the desired bit or bits.

Some memory devices, commonly referred to as Single-Level Cell (SLC) devices, store a single bit of information in each memory cell, i.e., each memory cell can be programmed to assume two possible programming levels. Higher-density devices, often referred to as Multi-Level Cell (MLC) devices, store two or more bits per memory cell, i.e., can be programmed to assume more than two possible programming levels.

Flash memory devices are described, for example, by Bez et al., in “Introduction to Flash Memory,” Proceedings of the IEEE, volume 91, number 4, April, 2003, pages 489-502, which is incorporated herein by reference. Multi-level Flash cells and devices are described, for example, by Eitan et al., in “Multilevel Flash Cells and their Trade-Offs,” Proceedings of the 1996 IEEE International Electron Devices Meeting (IEDM), New York, N.Y., pages 169-172, which is incorporated herein by reference. The paper compares several kinds of multilevel Flash cells, such as common ground, DINOR, AND, NOR and NAND cells.

Eitan et al. describe another type of analog memory cell called Nitride Read Only Memory (NROM) in “Can NROM, a 2-bit, Trapping Storage NVM Cell, Give a Real Challenge to Floating Gate Cells?” Proceedings of the 1999 International Conference on Solid State Devices and Materials (SSDM), Tokyo, Japan, Sep. 21-24, 1999, pages 522-524, which is incorporated herein by reference. NROM cells are also described by Maayan et al., in “A 512 Mb NROM Flash Data Storage Memory with 8 MB/s Data Rate”, Proceedings of the 2002 IEEE International Solid-State Circuits Conference (ISSCC 2002), San Francisco, Calif., Feb. 3-7, 2002, pages 100-101, which is incorporated herein by reference. Other exemplary types of analog memory cells are Floating Gate (FG) cells, Ferroelectric RAM (FRAM) cells, magnetic RAM (MRAM) cells, Charge Trap Flash (CTF) and phase change RAM (PRAM, also referred to as Phase Change Memory, or PCM) cells. FRAM, MRAM and PRAM cells are described, for example, by Kim and Koh in “Future Memory Technology including Emerging New Memories,” Proceedings of the 24th International Conference on Microelectronics (MIEL), Nis, Serbia and Montenegro, May 16-19, 2004, volume 1, pages 377-384, which is incorporated herein by reference.

Some non-volatile memory systems store data in redundant configurations in order to increase storage reliability and reduce the likelihood of data loss. For example, U.S. Patent Application Publication 2010/0017650, whose disclosure is incorporated herein by reference, describes a non-volatile memory data storage system, which includes a host interface for communicating with an external host, and a main storage including a first plurality of Flash memory devices. Each memory device includes a second plurality of memory blocks. A third plurality of first stage controllers are coupled to the first plurality of Flash memory devices. A second stage controller is coupled to the host interface and the third plurality of first stage controllers through an internal interface. The second stage controller is configured to perform Redundant Array of Independent Disks (RAID) operation for data recovery according to at least one parity.

As another example, U.S. Patent Application Publication 2009/0204872, whose disclosure is incorporated herein by reference, describes a Flash module having raw-NAND Flash memory chips accessed over a Physical-Block Address (PBA) bus by a controller. The controller converts logical block addresses to physical block addresses. In some embodiments, data can be arranged to provide redundant storage, which is similar to a RAID system, in order to improve system reliability.

SUMMARY OF THE INVENTION

An embodiment that is described herein provides a method for data storage in a memory that includes at least N memory units, each memory unit including memory blocks. The method includes defining superblocks, each superblock including a respective set of N of the memory blocks that are allocated respectively in N different ones of the memory units, such that compaction of all the memory blocks in a given superblock is performed without any intervening programming operation in the given superblock. Data is stored in the memory by computing redundancy information for a selected portion of the data, and storing the selected portion and the redundancy information in the N memory blocks of a selected superblock.

In some embodiments, erasure of all the memory blocks in the given superblock is performed without any intervening programming operation in the given superblock. In an embodiment, storing the data includes storing the selected portion in K memory blocks of the selected superblock, 1≦K<N, and storing the redundancy information in N−K remaining memory blocks of the selected superblock.

In another embodiment, storing the data includes programming the portions of the data and the redundancy information into N pages belonging respectively to the memory blocks of the selected superblock, without any intervening programming operation. In a disclosed embodiment, the method includes, upon a failure in a given memory unit from among the N memory units, recovering the data using at least some of the data and the redundancy information that is stored in the memory units other than the given memory unit.

In some embodiments, the method includes compacting the data stored in a source superblock of the memory by copying valid data from partially-valid memory blocks in the source superblock to a destination superblock and subsequently erasing the source superblock. Copying the valid data may include moving the valid data from first blocks of the source superblock to respective second blocks in the destination superblock, such that each second block belongs to the same memory unit as the respective first block. The method may include running a background process that copies parts of the data between different ones of the memory units.

In an embodiment, compacting the data includes, upon detecting that the source superblock no longer contains any valid data but the destination superblock is not full, selecting an additional source superblock and copying additional valid data from the additional source superblock to the destination superblock. In another embodiment, compacting the data includes selecting a given superblock for compaction based on a distribution of invalid data in the superblocks. Selecting the given superblock may include choosing the given superblock having a highest amount of the invalid data among the superblocks to serve as the source superblock.

In a disclosed embodiment, defining the superblocks includes classifying input data into rarely-updated data and frequently-updated data, assigning first superblocks for storing the rarely-updated data and assigning second superblocks, different from the first superblocks, for storing the frequently-updated data. In another embodiment, the method includes reserving at least one spare memory unit in addition to the N memory units, and replacing a failed memory unit from among the N memory units with the spare memory unit. The method may include temporarily using the spare memory unit for improving performance of data storage in the N memory units.

In some embodiments, the memory units are partitioned into multiple groups that are associated with respective multiple processors, and storing the data includes distributing storage of the data among the multiple processors. In another embodiment, the method includes, in response to a failure in a memory block of a given superblock, redefining the given superblock to include only the blocks other than the memory block having the failure, and storing subsequent data in the redefined superblock.

There is additionally provided, in accordance with an embodiment of the present invention, a method for data storage in a memory controller that accepts data items from a host for storage in multiple memory units. The method includes defining a mapping between logical addresses assigned by the host and respective physical storage locations in the memory units. Information indicative of the mapping is reported from the memory controller to the host. In the host, redundancy information is computed for the data items, and respective first logical addresses are assigned to the data items and second logical addresses are assigned to the redundancy information responsively to the reported information indicative of the mapping. The data items and the redundancy information are stored by the memory controller in the physical storage locations that correspond to the respective first and second logical addresses assigned by the host.

In some embodiments, assigning the first and second logical addresses in the host includes causing the data items and the redundancy information to be stored in different ones of the memory units. In an embodiment, reporting the information includes reporting respective ranges of the logical addresses that are mapped to the memory units according to the mapping. In another embodiment, reporting the information includes, responsively to modifying the mapping in the memory controller, reporting to the host updated information that is indicative of the modified mapping.

There is also provided, in accordance with an embodiment of the present invention, a data storage apparatus that includes an interface and a processor. The interface is configured to communicate with a memory that includes at least N memory units, each memory unit including memory blocks. The processor is configured to define superblocks, each superblock including a respective set of N of the memory blocks that are allocated respectively in N different ones of the memory units, such that compaction of all the memory blocks in a given superblock is performed without any intervening programming operation in the given superblock, and to store data in the memory by computing redundancy information for a selected portion of the data, and storing the selected portion and the redundancy information in the N memory blocks of a selected superblock.

There is further provided, in accordance with an embodiment of the present invention, a memory controller that includes an interface and a processor. The interface is configured to communicate with a host. The processor is configured to accept from the host via the interface data items for storage in multiple memory units, to define a mapping between logical addresses assigned by the host and respective physical storage locations in the memory units, to report information indicative of the mapping from the memory controller to the host so as to cause the host to compute redundancy information for the data items and assign respective first logical addresses to the data items and second logical addresses to the redundancy information responsively to the reported information indicative of the mapping, and to store the data items and the redundancy information accepted from the host in the physical storage locations that correspond to the respective first and second logical addresses assigned by the host.

There is further provided, in accordance with an embodiment of the present invention, a method for data storage in a memory that includes at least N memory units that are partitioned into memory blocks. The method includes grouping input data into sets of N logical pages, each set containing a respective portion of the data and redundancy information that is computed over the portion. Each set is stored in the memory such that the N logical pages in the set are stored in N different memory units. A selected memory block is compacted by reading at least one invalid page from the selected memory block, reading the redundancy information of the respective set to which the invalid page belongs, updating the read redundancy information based on the read invalid page and on new data, and storing the updated redundancy information and the new data in the memory.

There is additionally provided, in accordance with an embodiment of the present invention, a data storage apparatus that includes an interface and a processor. The interface is configured to communicate with a memory that includes at least N memory units that are partitioned into memory blocks. The processor is configured to group input data into sets of N logical pages, each set containing a respective portion of the data and redundancy information that is computed over the portion, to store each set in the memory such that the N logical pages in the set are stored in N different memory units, and to compact a selected memory block by reading at least one invalid page from the selected memory block, reading the redundancy information of the respective set to which the invalid page belongs, updating the read redundancy information based on the read invalid page and on new data, and storing the updated redundancy information and the new data in the memory.

The present invention will be more fully understood from the following detailed description of the embodiments thereof, taken together with the drawings in which:

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram that schematically illustrates a memory system, in accordance with an embodiment of the present invention;

FIG. 2 is a diagram that schematically illustrates a memory superblock, in accordance with an embodiment of the present invention;

FIG. 3 is a flow chart that schematically illustrates a method for redundant data storage using superblocks, in accordance with an embodiment of the present invention;

FIG. 4 is a diagram that schematically illustrates a process of superblock compaction, in accordance with an embodiment of the present invention;

FIG. 5 is a flow chart that schematically illustrates a method for redundant data storage, in accordance with another embodiment of the present invention;

FIG. 6 is a flow chart that schematically illustrates a method for recovery from die failure, in accordance with another embodiment of the present invention;

FIG. 7 is a flow chart that schematically illustrates a method for redundant data storage using logical address redundancy, in accordance with another embodiment of the present invention; and

FIG. 8 is a flow chart that schematically illustrates a method for redundant data readout, in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION OF EMBODIMENTS

Overview

Embodiments of the present invention that are described hereinbelow provide improved methods and systems for redundant data storage in solid-state non-volatile memory, such as Flash memory. The disclosed techniques may be carried out, for example, by a memory controller that stores data in multiple memory units, such as individual memory dies or packaged memory devices. The methods described herein enable the memory controller to recover the stored data in the event of memory unit failure. Typically, these methods involve accepting data for storage from a host, computing redundancy information for the data, and storing the data and the redundancy information in different memory units.

Data storage in solid-state non-volatile memory, e.g., Flash memory, has several unique characteristics relative to other storage media:

- Data is typically stored page by page, but erased block by block. Each block comprises multiple pages.
- Stored data cannot be updated in-place, and therefore updated data is typically stored in another physical location.
- The memory is specified to endure only a limited number of programming and erasure cycles.

In some embodiments, the memory controller stores data efficiently in spite of the above-described characteristics by using logical addressing. Typically, the host addresses the data for storage using logical addresses. The memory controller maps the logical addresses to respective physical storage locations for storing the data. When the host updates the data corresponding to a certain logical address, the memory controller stores the updated data in a new physical storage location, updates the logical-physical address mapping to remap the logical address to the new physical storage location, and marks the previous physical storage location as invalid. As data storage continues, particularly for non-sequential storage, the blocks gradually develop regions of invalid data (“holes”). The memory controller typically carries out a “garbage collection” process, which copies valid data from partially-valid blocks and stores the data compactly in other blocks. The cleared blocks are subsequently erased in preparation for storing new data.
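The remapping behavior described above can be illustrated by the following minimal sketch. It is not part of the disclosed embodiments: the class name `FlashTranslationSketch`, its methods and the append-only write pointer are assumptions chosen only for illustration, and erasure and garbage collection are deliberately left out.

```python
class FlashTranslationSketch:
    """Toy logical-to-physical map (hypothetical names, illustration only)."""

    def __init__(self, num_physical_pages):
        self.l2p = {}                              # logical address -> physical page
        self.valid = [False] * num_physical_pages  # validity flag per physical page
        self.store = [None] * num_physical_pages   # simulated page contents
        self.next_free = 0                         # assumed append-only write pointer

    def write(self, logical_addr, data):
        if self.next_free >= len(self.store):
            raise RuntimeError("no free pages; garbage collection would be needed")
        old = self.l2p.get(logical_addr)
        if old is not None:
            self.valid[old] = False                # previous location becomes a "hole"
        new = self.next_free
        self.next_free += 1
        self.store[new] = data                     # no in-place update: new location
        self.valid[new] = True
        self.l2p[logical_addr] = new               # remap the logical address
        return new

    def read(self, logical_addr):
        return self.store[self.l2p[logical_addr]]

    def invalid_pages(self):
        """Programmed pages that no longer hold valid data."""
        return [p for p in range(self.next_free) if not self.valid[p]]
```

In this sketch, rewriting a logical address leaves the old physical page marked invalid, which is the kind of hole that a garbage collection process would later reclaim.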

The unique characteristics of solid-state non-volatile memory may have a particularly adverse effect on the performance of redundant storage schemes that are known in the art. For example, when implementing a redundant storage scheme, the redundancy information is typically updated whenever the data is updated, thus causing a considerable increase in the number of programming operations. This increase may be tolerable in some storage media types, but not in non-volatile solid-state devices that degrade after a limited number of programming cycles. As will be explained below, the disclosed redundant storage schemes provide protection against memory unit failure while minimizing the effects of redundant storage on the lifetime and performance of the system.

In some embodiments, the memory controller performs redundant storage based on the physical storage locations assigned to the data. In these embodiments, the memory controller defines sets of memory blocks that are referred to as superblocks. Each superblock comprises N blocks, each selected from a different memory unit. Each set of N corresponding pages (with one page from each block) in a given superblock is referred to as a super-page. Data storage is carried out such that the N pages belonging to a given super-page are programmed together, and the N blocks belonging to a given superblock are erased together. Compaction of the memory is also performed in superblocks. The term “together” in this context means that no intervening programming operations are permitted to the super-page during programming of the N pages, and no intervening programming operations are permitted to the superblock during compaction or erasure of the N blocks.
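As a hedged illustration of the superblock and super-page definitions, the sketch below groups blocks into superblocks under one simple assumption that the text does not mandate: block index b in every memory unit belongs to superblock b. The function names `define_superblocks` and `super_page` are hypothetical.

```python
def define_superblocks(num_units, blocks_per_unit):
    """Return superblocks as lists of (unit, block) pairs.

    Assumption for illustration: superblock b takes block b from each of
    the N = num_units memory units, so its N blocks lie in N different units.
    """
    return [[(unit, blk) for unit in range(num_units)]
            for blk in range(blocks_per_unit)]


def super_page(superblock, page_index):
    """The N corresponding pages (one per block) forming a super-page."""
    return [(unit, blk, page_index) for unit, blk in superblock]
```

Any other assignment would serve equally well, provided that the N blocks of each superblock reside in N different memory units.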

The memory controller stores data in a given N-page super-page by accumulating N−K data pages (1≦K<N) in a buffer, computing K pages of redundancy information for the N−K data pages, and storing the N−K data pages and K pages of redundancy information together in the N pages of the super-page. As a result, the redundancy information is updated only once per N−K data pages and not per each incoming data page. Thus, fewer additional programming operations are caused by the redundant storage. In addition, the memory controller performs garbage collection by compacting and erasing entire superblocks rather than individual blocks, so as to provide empty superblocks for subsequent redundant data storage.
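The buffering scheme above can be sketched as follows for the special case K=1, using bitwise XOR as the redundancy computation. The choice of XOR parity, the class `SuperPageWriter` and the helper names are assumptions made for illustration; the text does not restrict the redundancy scheme to XOR.

```python
from functools import reduce


def xor_pages(pages):
    """Bytewise XOR of equal-length pages."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


class SuperPageWriter:
    """Accumulates N-1 data pages, then commits them with one parity page (K=1)."""

    def __init__(self, n_units):
        self.n = n_units
        self.buffer = []        # data pages waiting for a full super-page
        self.super_pages = []   # committed super-pages, N pages each

    def accept(self, page):
        self.buffer.append(page)
        if len(self.buffer) == self.n - 1:
            parity = xor_pages(self.buffer)          # computed once per N-1 pages
            self.super_pages.append(self.buffer + [parity])
            self.buffer = []


def recover(stored_super_page, failed_unit):
    """Rebuild the page of a failed unit from the N-1 surviving pages."""
    survivors = [p for i, p in enumerate(stored_super_page) if i != failed_unit]
    return xor_pages(survivors)
```

In this sketch the parity page is programmed once per N−1 data pages, in line with the reduction in programming operations described above, and any single lost page of a super-page can be rebuilt from the remaining pages.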

In alternative embodiments, the memory controller performs redundant storage based on the logical addresses assigned to the data. In an example embodiment, the memory controller selects N logical addresses, which are mapped respectively to N physical storage locations that reside in N different memory units. When data is accepted for storage from the host, the memory controller uses N−K of these addresses for storing data, and the remaining K addresses for storing redundancy information. As can be appreciated, the logical addresses that are used for storing the redundancy information undergo a higher number of programming operations, relative to the logical addresses that are used for storing the data. Therefore, in some embodiments, the memory controller stores the redundancy information at a lower storage density than the storage density used for storing the data. The lower storage density increases the endurance of the redundancy information to severe cycling.
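The uneven wear noted above can be made concrete with a small sketch, assuming K=1 and an XOR parity that is refreshed by a read-modify-write on every data update. The class `LogicalStripe` and its counters are hypothetical and only model the bookkeeping, not the actual storage.

```python
class LogicalStripe:
    """N-1 data addresses plus one parity address (K=1, XOR assumed)."""

    def __init__(self, n, page_size):
        self.data = [bytes(page_size) for _ in range(n - 1)]
        self.parity = bytes(page_size)
        self.data_writes = [0] * (n - 1)   # programming count per data address
        self.parity_writes = 0             # programming count of the parity address

    def update(self, idx, new_page):
        old_page = self.data[idx]
        # new parity = old parity XOR old data XOR new data
        self.parity = bytes(p ^ o ^ d
                            for p, o, d in zip(self.parity, old_page, new_page))
        self.data[idx] = new_page
        self.data_writes[idx] += 1
        self.parity_writes += 1            # parity is rewritten on every update
```

In this model the parity address is programmed once for every update of any data address in the stripe, so its programming count equals the sum of the data counts. This is consistent with the motivation given above for storing the redundancy information at a lower storage density.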

In some embodiments, the memory controller defines and applies the mapping between logical addresses and physical storage locations, but the redundancy scheme is defined and managed by the host. In these embodiments, the memory controller provides information regarding the logical-physical address mapping to the host, and the host uses this information to define the redundancy scheme. In an example embodiment, the host defines the logical addresses such that the data and the redundancy information will be stored in different memory units.

In some embodiments, the memory controller uses invalid data, which remains in obsolete blocks that were cleared by the garbage collection process, as temporary backup. In an example embodiment, the garbage collection process is defined such that valid data is read from a source block in one memory unit, and then written to a destination block in a different memory unit. If the memory unit holding the destination block fails, and the (obsolete) source block has not yet been erased, the memory controller recovers the data in question from the source block. In some embodiments, the memory controller delays erasure of obsolete source blocks as much as possible, in order to increase the availability of such temporary backup.
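The temporary-backup idea can be sketched as below. The `Unit` class, the `UnitFailure` exception and the `backups` dictionary are hypothetical constructs for illustration; in particular, how a controller tracks not-yet-erased source blocks is an assumption and is not specified in the text.

```python
class UnitFailure(Exception):
    """Raised when a simulated memory unit has failed."""


class Unit:
    def __init__(self):
        self.blocks = {}     # block id -> contents
        self.failed = False

    def read(self, block_id):
        if self.failed:
            raise UnitFailure()
        return self.blocks[block_id]


def gc_copy(units, src, dst, backups):
    """Copy valid data from src to dst, both given as (unit, block) pairs
    in different memory units, and defer erasure of the source block."""
    (src_unit, src_blk), (dst_unit, dst_blk) = src, dst
    assert src_unit != dst_unit, "destination must be in a different unit"
    units[dst_unit].blocks[dst_blk] = units[src_unit].read(src_blk)
    backups[dst] = src       # obsolete source block kept as temporary backup


def read_with_backup(units, loc, backups):
    unit, blk = loc
    try:
        return units[unit].read(blk)
    except UnitFailure:
        b_unit, b_blk = backups[loc]   # fall back to the un-erased source block
        return units[b_unit].read(b_blk)
```

Once the source block is erased, its entry would have to be removed from `backups`, which is why delaying erasure extends the period during which this fallback remains available.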

In some embodiments, the memory controller protects the stored data against memory unit failure by encoding the data with an Error Correction Code (ECC), and distributes the bits of each ECC code word over multiple memory units. Typically, the ECC and the number of memory units are selected such that the code word is still decodable when all the bits stored in a given memory unit are lost. In an embodiment, the bits that are stored in a failed memory unit are identified to the ECC decoder as erasures.

In some embodiments, the memory controller stores data in N memory units, and redundancy information for the data in an (N+1)th memory unit. In order to retrieve the data, the memory controller issues a respective read command to each of the N+1 memory units. As soon as responses for the first N read commands (which may comprise data and/or redundancy information) arrive, the memory controller reconstructs the data from the N responses without waiting for the (N+1)th response. Using this technique, data readout is not sensitive to temporarily slow response times, e.g., to delays caused by memory units that are busy with other tasks.
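A minimal sketch of this readout technique follows, assuming that the redundancy information is a single XOR parity page held in the unit with index N. Both the XOR assumption and the function names are illustrative and are not taken from the text.

```python
from functools import reduce


def xor_pages(pages):
    """Bytewise XOR of equal-length pages."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


def reconstruct(responses, n):
    """Rebuild the n data pages from the first n of n+1 responses.

    responses: (unit_index, page) pairs in arrival order; units 0..n-1 hold
    data and unit n holds the XOR parity (an assumption of this sketch).
    """
    first = dict(responses[:n])            # ignore the slowest response
    missing = [i for i in range(n) if i not in first]
    if missing:
        # Exactly one data page is absent, so the parity page is present:
        # XOR of the n received pages yields the missing data page.
        first[missing[0]] = xor_pages(list(first.values()))
    return [first[i] for i in range(n)]
```

If the slowest response happens to be the parity page, the data pages are returned directly; otherwise the one outstanding data page is computed rather than awaited.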

System Description

FIG. 1 is a block diagram that schematically illustrates a multi-device memory system 20, in accordance with an embodiment of the present invention. System 20 accepts data for storage from a host 24 and stores it in memory, and retrieves data from memory and provides it to the host. In the present example, system 20 comprises a Solid-State Disk (SSD) that stores data for a host computer. In alternative embodiments, however, system 20 may be used in any other suitable application and with any other suitable host, such as in computing devices, cellular phones or other communication terminals, removable memory modules such as Disk-On-Key (DOK) devices, Secure Digital (SD) cards, Multi-Media Cards (MMC) and embedded MMC (eMMC), digital cameras, music and other media players and/or any other system or device in which data is stored and retrieved.

System 20 comprises multiple non-volatile memory devices 28, each comprising multiple analog memory cells 32. In the present example, devices 28 comprise NAND Flash devices, although various other suitable solid state memory types, such as NOR and Charge Trap Flash (CTF) cells, phase change RAM (PRAM, also referred to as Phase Change Memory, or PCM), Nitride Read Only Memory (NROM), Ferroelectric RAM (FRAM), magnetic RAM (MRAM) and/or Dynamic RAM (DRAM) cells, can also be used.

In the context of the present patent application and in the claims, the term “analog memory cell” is used to describe any memory cell that holds a continuous, analog value of a physical parameter, such as an electrical voltage or charge. Any suitable type of analog memory cells, such as the types listed above, can be used. In the present example, each memory device 28 comprises a non-volatile memory of NAND Flash cells. The charge levels stored in the cells and/or the analog voltages or currents written into and read out of the cells are referred to herein collectively as analog values or storage values. Although the embodiments described herein mainly address threshold voltages, the methods and systems described herein may be used with any other suitable kind of storage values.

In each memory device 28, data is stored in memory cells 32 by programming the cells to assume respective memory states, which are also referred to as programming levels. The programming levels are selected from a finite set of possible levels, and each level corresponds to a certain nominal storage value. For example, a 2 bit/cell MLC can be programmed to assume one of four possible programming levels by writing one of four possible nominal storage values into the cell. Memory cells 32 are typically arranged in one or more memory arrays (“planes”), each comprising multiple rows and columns. The memory cells in each row are connected to a respective word line, and the memory cells in each column are connected to a respective bit line.

Each memory array is typically divided into multiple pages, i.e., groups of memory cells that are programmed and read simultaneously. Pages are sometimes sub-divided into sectors. In some embodiments, each page occupies an entire row of the array, i.e., an entire word line. For two-bit-per-cell devices, for example, each word line stores two pages. In alternative embodiments, each row (word line) can be divided into two or more pages. For example, in some devices each row is divided into two pages, one comprising the odd-order cells and the other comprising the even-order cells. In an example implementation, a two-bit-per-cell memory device may have four pages per row, a three-bit-per-cell memory device may have six pages per row, and a four-bit-per-cell memory device may have eight pages per row.
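The page counts quoted above follow from simple arithmetic, which the short sketch below reproduces. The function name `pages_per_row` and its `odd_even_split` parameter are hypothetical, and the sketch assumes one page per stored bit for each group of cells in a row.

```python
def pages_per_row(bits_per_cell, odd_even_split=True):
    """Pages stored on one word line.

    Each bit per cell contributes one page per group of cells; splitting the
    row into odd-order and even-order cells gives two groups instead of one.
    """
    groups = 2 if odd_even_split else 1
    return bits_per_cell * groups
```

With the odd/even split, two, three and four bits per cell give four, six and eight pages per row respectively; without the split, a two-bit-per-cell word line stores two pages.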

Typically, a given memory device comprises multiple erasure blocks (also referred to as memory blocks), i.e., groups of memory cells that are erased together. Each memory device 28 may comprise a packaged device or an unpackaged semiconductor chip or die. In some embodiments, each memory device 28 comprises multiple dies. A typical SSD may comprise a number of 4 GB devices. Generally, however, system 20 may comprise any suitable number of memory devices of any desired type and size.

Each memory device 28 comprises an internal NAND controller 36, which stores data in the memory cells of the device. Each NAND controller 36 performs data storage and retrieval in its respective memory device in response to NAND commands. Each NAND command typically specifies writing or reading of a single memory page in the memory device. In system 20, data storage in memory devices 28 is carried out by a hierarchical configuration of processors and controllers, which provides a high degree of parallelization of storage tasks, and therefore achieves high storage throughput with low latency.

Memory devices 28 in system 20 are arranged in subsets. A Memory Signal Processor (MSP) 40 is associated with each subset and performs data storage and retrieval in the subset. In some embodiments, each MSP 40 comprises an Error Correction Code (ECC) unit 44, which encodes the data for storage with a suitable ECC, and decodes the ECC of data retrieved from memory. In some embodiments, the subsets of memory devices 28 (each with its respective MSP) are aggregated into groups that are referred to as channels.

System 20 comprises a main controller 52, which manages the system operation. Main controller 52 comprises multiple channel controllers 48, each responsible for data storage and retrieval in a respective channel. The main controller accepts commands from host 24 to store and/or retrieve data, and communicates with the MSPs in order to carry out these commands. Typically, the communication between the main controller and the channel controllers, between the channel controllers and MSPs, and between the MSPs and the NAND controllers, comprises both data and control aspects.

In some embodiments, main controller 52 comprises a host interface processor 53 and a main processor 54. The host interface processor forwards host commands between main processor 54 and host 24, and forwards data between the MSPs and the host. Main processor 54 executes the host commands, performs Flash management functions on memory devices 28, and communicates with the MSPs. In alternative embodiments, any other suitable main controller configuration can also be used. For example, Flash management functions may be partitioned between main processor 54 and MSPs 40. In some embodiments, the system comprises volatile memory, in the present example one or more Dynamic Random Access Memory (DRAM) devices 56, connected to main controller 52.

In an example embodiment, system 20 comprises a total of eight channel controllers 48 and thirty-two MSPs 40, i.e., each channel controller manages four MSPs. In this configuration, each MSP may manage between one and eight memory devices 28. A given memory device 28 may comprise multiple dies. Alternatively, however, any other suitable numbers of channel controllers, MSPs and memory devices can also be used. The number of memory devices managed by each MSP may differ, for example, according to the storage capacity of system 20 and the storage density of devices 28. In some embodiments, the channel controllers may be omitted. In these embodiments, the main controller stores and retrieves data by communicating directly with the MSPs.
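For orientation, the component counts implied by this example configuration can be worked out as in the following sketch. The function `system_counts` and its parameter names are hypothetical; the default values are the figures given in the example embodiment.

```python
def system_counts(channels=8, msps_per_channel=4, devices_per_msp=(1, 8)):
    """Return (number of MSPs, minimum devices, maximum devices)."""
    msps = channels * msps_per_channel
    fewest, most = devices_per_msp
    return msps, msps * fewest, msps * most
```

With eight channel controllers and four MSPs per channel there are thirty-two MSPs, which together manage between 32 and 256 memory devices.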

In some embodiments, system 20 performs logical-to-physical address translation for storing data in memory devices 28. Typically, host 24 addresses the data using logical addresses. Main controller 52 maps each logical address to a respective physical storage location in memory devices 28. The mapping between logical addresses and physical storage locations may change over time, e.g., when data at a certain logical address is updated. Thus, main controller 52 typically maintains a data structure holding the current mapping between logical addresses and physical storage locations. This data structure can be stored, for example, in DRAM 56 or in memory devices 28.

The functions of NAND controllers 36, MSPs 40, channel controllers 48 and main controller 52 may be implemented, for example, using software running on suitable Central Processing Units (CPUs), using hardware (e.g., state machines or other logic), or using a combination of software and hardware elements. In some embodiments, NAND controllers 36, MSPs 40, channel controllers 48 and/or main controller 52 may comprise general-purpose processors, which are programmed in software to carry out the functions described herein. The software may be downloaded to the processors in electronic form, over a network, for example, or it may, alternatively or additionally, be provided and/or stored on non-transitory tangible media, such as magnetic, optical, or electronic memory.

The system configuration of FIG. 1 is an example configuration, which is shown purely for the sake of conceptual clarity. In alternative embodiments, any other suitable memory system configuration can also be used. Elements that are not necessary for understanding the principles of the present invention, such as various interfaces, addressing circuits, timing and sequencing circuits and debugging circuits, have been omitted from the figure for clarity.

In the exemplary system configuration shown in FIG. 1, the channel controllers are comprised in the main controller, and the main controller, MSPs and memory devices are implemented as separate Integrated Circuits (ICs). In an alternative embodiment, memory devices 28, MSPs 40, channel controllers 48 and main controller 52 may be implemented as separate ICs. Alternatively, each subset of devices 28 with its respective MSP can be fabricated on a common die or device. Further alternatively, the MSPs, channel controllers and main controller may be fabricated in a common Multi-Chip Package (MCP) or System-on-Chip (SoC), separate from memory devices 28 and DRAM 56. In alternative embodiments, the elements of system 20 can be partitioned into packaged ICs or semiconductor dies in any other suitable way. Further alternatively, some or all of the functionality of main controller 52 can be implemented in software and carried out by a processor or other element of the host system. In some embodiments, host 24 and main controller 52 may be fabricated on the same die, or on separate dies in the same device package. The main controller, channel controllers, MSPs and NAND controllers are collectively regarded herein as a memory controller, which carries out the methods described herein.

Redundant Data Storage in Non-Volatile Memory

In some embodiments, system 20 stores data in devices 28 using a redundant configuration, which provides protection against hardware failures. Typically, system 20 stores certain data by computing redundancy information for the data, distributing the data over N−1 storage locations, and storing the redundancy information in an Nth storage location (N≧2). The N storage locations are typically selected to reside in different memory units, e.g., different dies, memory devices or channels. When using this sort of redundancy, the data can be recovered using the redundancy information even in the event of memory unit failure.

The term “redundancy information” is used herein to describe any information that enables recovering the data in the event that parts of the data are lost or corrupted. In an example embodiment, the redundancy information comprises a bit-wise exclusive-OR (XOR) of the N−1 data portions. In another embodiment, the data is encoded with an Error Correction Code (ECC) that produces redundancy bits, the data is stored in N−1 locations, and the redundancy bits are stored in the Nth location. Further alternatively, any other suitable type of redundancy information can also be used.
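The XOR embodiment can be illustrated with a minimal sketch. The function names `xor_pages` and `recover_page` are hypothetical and chosen only for this example. Because XOR is its own inverse, any single lost portion equals the XOR of the N−1 portions that survive (the remaining data portions and the redundancy portion).

```python
from functools import reduce
from typing import Sequence


def xor_pages(pages: Sequence[bytes]) -> bytes:
    """Bit-wise XOR of equal-length pages."""
    if not pages:
        raise ValueError("need at least one page")
    size = len(pages[0])
    if any(len(p) != size for p in pages):
        raise ValueError("pages must have equal length")
    # zip(*pages) yields, per byte position, the tuple of bytes at that position.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*pages))


def recover_page(surviving: Sequence[bytes]) -> bytes:
    """Rebuild the single missing page from the N-1 surviving pages
    (the remaining data pages plus the redundancy page)."""
    return xor_pages(surviving)
```

This sketch covers only the single-redundancy (K=1) case; ECC-based redundancy would require a different construction.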

Although the embodiments described herein refer mainly to storage of data in N−1 locations and redundancy information in an additional Nth location, the disclosed techniques are not limited to this configuration. In alternative embodiments, system 20 may store the data in N−K locations, 1≦K<N, and store the redundancy information in K locations.

The description that follows refers mainly to memory dies as the basic memory unit. This choice, however, is made purely by way of example. In alternative embodiments, the techniques described herein can be applied to any other suitable memory units, such as groups of dies, individual memory devices, groups of memory devices, or SSD channels. Further alternatively, each memory unit may comprise a memory plane or even a word line, group of word lines or memory block. Typically, a configuration that spreads the data and redundancy across multiple memory units provides protection against failure of that type of memory unit. For example, when the basic memory unit is a die, the disclosed techniques provide protection against die failure. When the basic memory unit is an SSD channel, the disclosed techniques provide protection against channel failure.

Data storage in non-volatile memory has certain unique characteristics. In particular: (1) data is stored page by page, (2) stored data cannot be updated in-place and therefore the updated data needs to be stored in another physical location, (3) data is erased block by block, each block containing multiple pages, and (4) non-volatile memory is specified to endure only a limited number of programming and erasure cycles. When implementing redundant storage schemes, these characteristics may have major impact on system performance.

Consider, for example, N−1 data portions that are stored in N−1 blocks, and redundancy information that is stored in an Nth block. Updating any portion of the data involves updating the redundancy information, as well. Therefore, each data portion update causes two programming operations: one for updating the data portion and one for updating the redundancy information. As a result, the total number of programming and erasure cycles in the system may increase considerably. Moreover, the storage location holding the redundancy information may suffer from severe cycling and wearing, since it is updated for any data portion update.

Embodiments of the present invention provide improved methods and systems for redundant storage in non-volatile memory. These techniques provide protection against memory unit failure, while minimizing the effects of redundant storage on the lifetime and performance of the system.

With reference to FIG. 1, the disclosed techniques may be carried out by main processor 54, channel controllers 48, MSPs 40, NAND controllers 36, or any suitable combination or subset of these processors. The processor or processors that carry out the disclosed techniques in a given configuration are sometimes referred to as “processing circuitry.” The disclosed techniques can be partitioned among the different processors in any suitable way.

Redundant Storage in Non-Volatile Memory Using Superblocks

In some embodiments, system 20 defines superblocks by associating memory blocks from different dies. Each superblock comprises N blocks, each block selected from a different die. In some embodiments, each set of N corresponding pages in the N blocks of a given superblock is referred to as a super-page. All the blocks in a given superblock are compacted and erased together, and all the pages in a given super-page are programmed together.

FIG. 2 is a diagram that schematically illustrates a memory superblock, in accordance with an embodiment of the present invention. The example of FIG. 2 shows N memory dies denoted #1 . . . #N, each die comprising multiple memory blocks 58. The figure shows a superblock 60, which comprises N blocks 62A . . . 62N. Each block in the superblock resides in a different die. As will be explained below, all the blocks in the superblock are erased together when system 20 compacts the data that is stored in memory.

Each memory block 58 comprises multiple pages 66. In some embodiments, system 20 regards each set of corresponding pages in the superblock (i.e., N pages, one from each block of the superblock) as a super-page. The pages of a given super-page are programmed together. As can be seen in the figure, the blocks of a given superblock are not necessarily located in the same location in their respective dies. Similarly, the pages of a given super-page are not necessarily located in the same location in their respective blocks.

When using super-pages, system 20 accumulates N−1 pages of incoming data (e.g., in a volatile memory buffer) and then calculates the redundancy information for the N−1 pages. Then, system 20 stores the data and redundancy information in a single super-page: the data is stored in N−1 pages and the redundancy information in the Nth page. Therefore, the redundancy information (Nth page) is updated only once per N−1 pages and not per each incoming page. This technique reduces the additional cycling caused by the redundant storage. If a certain die fails, system 20 can recover a given page stored in that die using (1) the other N−2 data pages in the super-page and (2) the redundancy information stored in the Nth page of the super-page.
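The reduction in cycling can be quantified with a rough page-level count. The sketch below is an illustrative simplification that ignores garbage-collection copies; the function name and its parameters are assumptions and not part of the described system. It compares rewriting the redundancy page on every data-page write with writing it once per super-page.

```python
from fractions import Fraction


def writes_per_user_page(n: int, per_super_page: bool) -> Fraction:
    """Programming operations per user data page (garbage collection ignored).

    per_super_page=False: redundancy is rewritten on every data-page update.
    per_super_page=True:  redundancy is written once per N-1 data pages.
    """
    if n < 2:
        raise ValueError("N must be at least 2")
    if per_super_page:
        return 1 + Fraction(1, n - 1)
    return Fraction(2)
```

For N=8, for example, this count gives 8/7 programming operations per user page when the redundancy page is written once per super-page, compared with 2 when it is rewritten on every update.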

In some embodiments, system 20 carries out a “garbage collection” process, which compacts valid data that is stored in the different memory blocks and clears blocks for erasure and new programming. Data may become invalid, for example, when updated data for a certain logical address arrives, and is stored by the system in another physical location. The old data for this logical address is invalid, but cannot be erased immediately because the block may contain other data that is still valid. As the system continues to update data, regions of invalid data accumulate in the memory blocks. The garbage collection process compacts the data and attempts to minimize the invalid data regions.

(In the present context, the term “valid data” refers to the most recent copy of the data corresponding to a given logical address, or to the data that is pointed to by the system. Thus, data typically becomes invalid when a more recent copy of the data is stored and pointed to by the system. The term “invalid data” means a previous copy of certain data, which is no longer pointed to by the system.)

For example, the garbage collection process may consolidate valid data from two or more blocks into one or more new blocks. As another example, the garbage collection process may copy data from a partially-valid block (i.e., a block having some valid and some invalid pages) into a new block. A process of this sort may be carried out when new data arrives, or as a background process. Typically, the garbage collection process copies valid data from one or more partially-valid blocks to other blocks, and then erases the partially-valid blocks that are now obsolete. The erased blocks are available for new programming.

In some embodiments, system 20 carries out the compaction and garbage collection process at a superblock level. In particular, system 20 erases all the blocks of a given superblock together. If the total number of blocks is sufficiently large, the number of actual write operations per user write (“write efficiency”) does not change significantly when garbage collection is done by blocks or by superblocks, because the percentage of valid data in each block (or superblock) that is merged during the garbage collection process is approximately the same. The advantage of performing garbage collection at the superblock level stems from the number of write operations that are needed for storing the redundancy information.

Assume, for example, that a single redundancy block is stored per N−1 data blocks. If garbage collection is carried out at the individual block level, then for the case of random data programming (i.e., the host sends programming commands to random logical addresses), whenever a block is changed, the appropriate redundancy block should also be changed, by issuing another block programming command to the same logical address of the former redundancy block (in a different memory device). Therefore, the number of block write operations for the case of random programming is doubled due to the need to update the redundancy.

On the other hand, if garbage collection is performed at the superblock level, the redundancy block is part of the same superblock as the data. Therefore, even for random write operations, only a single write operation is issued for each user write operation. Therefore, using superblock-level garbage collection vs. block level garbage collection doubles the programming speed and reduces the wear of the blocks by a factor of two.

The term “erased together” means that erasure of the memory blocks in a given superblock is performed without allowing any intervening programming operations on the superblock. In other words, the blocks in a superblock can be erased in parallel, semi-parallel or even sequentially, as long as no other programming operations are performed in the superblock until all the blocks are erased. Intervening readout operations may be allowed.

FIG. 3 is a flow chart that schematically illustrates a method for redundant data storage using superblocks, in accordance with an embodiment of the present invention. The method begins with system 20 defining superblocks, and super-pages within the superblock, at a superblock definition step 70.

After defining the superblock, system 20 carries out two processes irrespective of one another, namely a data storage process and a garbage collection process. In the data storage process, system 20 accumulates N−1 pages of data that is accepted for storage from host 24, at a data accumulation step 72. The system computes redundancy information for the accumulated data, at a redundancy computation step 76. For example, the system may calculate the bit-wise XOR of the N−1 data pages, to produce an Nth redundancy page. The system then stores the N pages (N−1 data pages and redundancy page) in the N pages of a given super-page, at a storage step 80. Thus, each of the N pages is stored on a different die. As a result, the data can be recovered using the redundancy information in the event of die failure.
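The data storage process of steps 72-80 can be sketched as follows, assuming XOR redundancy. The class `SuperPageWriter` and its interface are hypothetical; in particular, the returned list stands in for the N pages that would be programmed on N different dies.

```python
from typing import List, Optional


class SuperPageWriter:
    """Buffers N-1 data pages, then emits one N-page super-page."""

    def __init__(self, n: int, page_size: int) -> None:
        if n < 2:
            raise ValueError("N must be at least 2")
        self.n = n
        self.page_size = page_size
        self._buffer: List[bytes] = []

    def write(self, page: bytes) -> Optional[List[bytes]]:
        """Accept one data page.

        Returns the full super-page (N-1 data pages followed by the XOR
        redundancy page) once N-1 pages are buffered, otherwise None.
        """
        if len(page) != self.page_size:
            raise ValueError("unexpected page size")
        self._buffer.append(page)
        if len(self._buffer) < self.n - 1:
            return None
        parity = bytearray(self.page_size)
        for p in self._buffer:
            for i, byte in enumerate(p):
                parity[i] ^= byte
        super_page = self._buffer + [bytes(parity)]
        self._buffer = []
        return super_page
```

In this sketch the redundancy page is produced exactly once per N−1 accepted data pages, which is the property that limits the additional cycling.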

In the garbage collection process, system 20 selects one or more partially-valid superblocks (superblocks whose blocks contain some valid and some invalid pages) for data consolidation (also referred to as compaction), at a superblock selection step 84. System 20 may select superblocks for compaction using any suitable criterion. Typically, the criterion depends on the distribution of invalid data in the superblocks. For example, the system may choose to compact the superblocks having the largest number of invalid pages. The system then consolidates the valid data from the selected superblocks, at a data consolidation step 88. The compaction process may copy the valid data from two or more selected superblocks to a new superblock, copy the valid data from one partially-valid superblock into another partially-valid superblock, or copy the valid data from a single selected superblock to a new superblock, for example.
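One possible selection criterion for step 84, picking the superblocks with the largest number of invalid pages, can be sketched as follows. The function name, the dictionary input and the tie-breaking rule are assumptions made for illustration only.

```python
from typing import Dict, List


def select_for_compaction(invalid_pages: Dict[int, int], count: int = 1) -> List[int]:
    """Return ids of the `count` superblocks with the most invalid pages.

    Superblocks with no invalid pages are never selected.
    Ties are broken in favor of the lower superblock id.
    """
    candidates = [(sb, inv) for sb, inv in invalid_pages.items() if inv > 0]
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [sb for sb, _ in candidates[:count]]
```

Any other suitable criterion could be substituted by changing the sort key.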

The compaction process of step 88 produces at least one obsolete superblock, whose valid data was copied to another superblock. System 20 erases the obsolete superblock, at a superblock erasure step 92. As noted above, all the blocks in the superblock are erased without allowing any intervening programming operations in the superblock.

When performing the compaction process of step 88, system 20 may copy valid data from a source superblock to a destination superblock using intra-die and/or inter-die copy operations. In some embodiments, the system allows valid data to be copied from a source block in one die to a destination block in a different die. For example, the system may read the valid data from the entire source superblock, and write it sequentially to the destination superblock. Allowing inter-die copy operations enables a high degree of flexibility, and potentially achieves tighter compaction. Inter-die copy operations may be particularly advantageous when the invalid pages (“holes”) are not distributed uniformly across the blocks of the source superblock. On the other hand, inter-die copy operations cause considerable data traffic to pass through MSPs 40, channel controllers 48 and/or main processor 54.

In alternative embodiments, system 20 performs superblock compaction using only intra-die copy operations. In other words, when copying valid data from a source superblock to a destination superblock, data can only be copied between source and destination blocks in the same die. This limitation may degrade the compaction slightly, but on the other hand simplifies the compaction process significantly. For example, the copy operations can be performed by NAND controllers 36 internally in memory devices 28, without transferring data through the MSPs, channel controllers and main processor. Partial limitations are also possible. For example, inter-die copy operations may be permitted only between dies in the same memory device, between dies that are managed by the same MSP, or between dies that are managed by the same channel controller.

In some cases, after the valid data has been copied from the source superblock(s) to the destination superblock, the destination superblock is still not fully programmed. In a scenario of this sort, system 20 may select an additional source superblock (e.g., the superblock having the next-highest number of invalid pages), and copy valid data from this superblock to the destination superblock. The system marks the copied pages as invalid in the new source superblock. If necessary, this process can be repeated (by progressively adding additional source superblocks) until the destination superblock is full.

FIG. 4 is a diagram that schematically illustrates a process of superblock compaction, in accordance with an embodiment of the present invention. Initially, a first source superblock 94A is compacted into a destination superblock 94C. As can be seen in the figure, the valid super-pages in superblock 94A are fragmented, and the memory controller thus copies them in a compact manner to superblock 94C. After superblock 94A has been compacted and cleared for erasure, the memory controller selects superblock 94B as the next superblock for compaction. Since destination superblock 94C still has super-pages that are available for programming, the memory controller copies the valid super-pages from source superblock 94B to destination superblock 94C. This process may continue as long as the destination superblock is not full.
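The progressive filling of a destination superblock from successive source superblocks can be sketched as follows. Modeling a superblock as a list of super-page slots, with `None` marking an invalid or unprogrammed slot, is an assumption made purely for illustration, as is the function name `compact_into`.

```python
from typing import List, Optional

# A superblock is modeled as a list of super-page slots; None marks an invalid
# ("hole") or unprogrammed slot. This model is illustrative only.
Superblock = List[Optional[str]]


def compact_into(dest: Superblock, sources: List[Superblock]) -> List[int]:
    """Copy valid super-pages from `sources`, in order, into free slots of `dest`.

    Copied slots are marked invalid in their source. Returns the indices of
    the sources left with no valid data (obsolete, ready for erasure).
    """
    free = [i for i, slot in enumerate(dest) if slot is None]
    obsolete: List[int] = []
    for idx, src in enumerate(sources):
        for j, item in enumerate(src):
            if item is None:
                continue
            if not free:
                return obsolete  # destination is full; stop mid-source
            dest[free.pop(0)] = item
            src[j] = None
        obsolete.append(idx)
    return obsolete
```

A source that is only partly drained when the destination fills up keeps its remaining valid super-pages and is not reported as obsolete.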

In some cases, the number of invalid pages (“holes”) may vary considerably from die to die. For example, some dies may hold static data that is rarely updated, and therefore have few invalid pages. Other dies may hold data that is updated frequently, and therefore have a large number of invalid pages. In system configurations that allow only intra-die copy, the compaction of superblocks may be degraded considerably. In some embodiments, system 20 avoids unbalanced scenarios of this sort by running a background process that copies blocks from die to die.

In some embodiments, system 20 classifies the data for storage into rarely-updated (“static” or “cold”) data and frequently-updated (“dynamic” or “hot”) data. Using the classification, the system stores the rarely-updated data and frequently-updated data in separate superblocks. This sort of separation increases the compaction efficiency. In contrast, if rarely-updated and frequently-updated data were to be stored in the same superblock, the system may have to copy rarely-updated data to another superblock unnecessarily, in order to be able to erase the superblock, even though the rarely-updated data has few “holes.” In some embodiments, system 20 assesses the update frequency of the data (e.g., by recording the most recent time at which each logical address was programmed). Using this assessment, the system can assign rarely-updated data and frequently-updated data to separate superblocks.
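A minimal sketch of such a classification, based on the most recent programming time of each logical address, is given below. The fixed age threshold `cold_after` and the function name are assumptions; the description above does not specify how the recorded times are turned into a hot/cold decision.

```python
from typing import Dict


def classify(last_write: Dict[int, float], now: float, cold_after: float) -> Dict[int, str]:
    """Label each logical address 'hot' or 'cold' by time since its last write.

    An address is 'cold' if it has not been programmed for at least
    `cold_after` time units (an assumed, tunable threshold).
    """
    return {
        lba: ("cold" if now - t >= cold_after else "hot")
        for lba, t in last_write.items()
    }
```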

In some embodiments, when a certain block in a superblock fails, system 20 redefines the superblock to have a smaller number of blocks. The failed block is thus removed from the redefined superblock. Consider, for example, a scheme in which each superblock comprises N blocks having corresponding physical addresses in N different memory units (dies in the present example). The N blocks in each superblock comprise N−1 data blocks and a redundancy block. In an embodiment, if a certain block in a given superblock fails (typically marked by the system as a bad block), the system redefines the superblock to include only the remaining N−1 blocks, e.g., N−2 data blocks and a redundancy block. If another block in the superblock fails, the process may continue by redefining the superblock to include a total of N−2 blocks.

This technique avoids the need to redefine the entire assignment of blocks to superblocks in the system in the event of block failure. In other words, the effect of a failed block is confined to the superblock containing that failed block. Although the description above refers to a scheme in which each superblock comprises blocks having corresponding physical addresses in multiple dies, the disclosed technique is not limited to this assignment and can be used in superblocks having any other suitable block selection.
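The confinement of a block failure to a single superblock can be sketched as follows. The `(die, block index)` identifier and the function name `remove_failed_block` are hypothetical and serve only to illustrate the redefinition.

```python
from typing import Dict, List, Tuple

Block = Tuple[int, int]  # hypothetical (die, block index) identifier


def remove_failed_block(superblocks: Dict[int, List[Block]], failed: Block) -> int:
    """Drop `failed` from the one superblock containing it.

    Returns the id of the redefined superblock. All other superblocks are
    left untouched.
    """
    for sb_id, blocks in superblocks.items():
        if failed in blocks:
            blocks.remove(failed)
            return sb_id
    raise KeyError(f"block {failed} is not part of any superblock")
```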

In the above embodiments, the system stores data in K blocks of a given superblock and redundancy information in the remaining N−K blocks of the given superblock. In alternative embodiments, however, there is no dedicated partitioning between blocks used for storing data and blocks used for storing redundancy information. For example, the data and redundancy information in a given superblock may be interleaved over the N blocks, such that at least one of the blocks (and often all N blocks) in the superblock comprises both data and redundancy information.

Host-Assisted Redundant Storage

As explained above, system 20 receives from host 24 data that is addressed using logical addresses, and stores the data in respective physical storage locations in memory devices 28. The translation between logical addresses and physical storage locations is performed in accordance with a logical-physical address mapping that is defined and updated by main controller 52.

In some of the redundant storage configurations described herein, the redundancy scheme is defined and carried out entirely within system 20. In these configurations, the host has no knowledge or involvement in the redundant storage process, and may not even be aware that redundancy information is produced and stored. The following description, on the other hand, presents embodiments in which the redundancy scheme is applied by the host, based on information regarding the logical-to-physical address mapping that is provided by system 20.

In some embodiments, main controller 52 provides to host 24 information regarding the logical-physical address mapping. In particular, the information provided by controller 52 indicates which logical addresses are mapped to each die. For example, controller 52 may send to host 24 a set of logical address ranges, each range comprising the logical addresses mapped to a respective die. Using this information, the host can cause a particular data item to be stored in a particular die, by assigning this data item a logical address that is mapped to the desired die. In other words, the information reported by controller 52 enables the host to control the physical storage locations in which system 20 will store data, even though system 20 applies logical-to-physical address translation. This mechanism enables the host to implement various redundancy schemes in system 20, by assigning appropriate logical addresses to the data.

For example, the host may generate redundancy information for certain data, and assign the data and the redundancy information logical addresses that cause them to be stored in different dies. In an example embodiment, the host accumulates N−1 data pages, and calculates a redundancy page over the data using bit-wise XOR. Then, the host assigns the N−1 data pages and the Nth redundancy page N logical addresses, which cause the N pages to be stored in N different dies. The host is able to assign the logical addresses appropriately, based on the information regarding the logical-physical mapping that is reported by controller 52. In alternative embodiments, the host can use information regarding the logical-physical mapping to implement any suitable redundant storage scheme, such as various RAID schemes.

Since the logical-physical address mapping typically changes over time, system 20 typically updates the host with these changes. The updates can be reported as they occur, or on a periodic basis. In some embodiments, system 20 and host 24 support an interface, over which controller 52 reports the information regarding the logical-physical address mapping to the host.

FIG. 5 is a flow chart that schematically illustrates a method for redundant data storage, in accordance with another embodiment of the present invention. The method begins with main controller 52 in system 20 defining a mapping between logical addresses and physical storage locations, at a mapping definition step 100. Controller 52 reports information regarding the mapping to host 24, at a mapping reporting step 104. In particular, controller 52 indicates to host 24 which logical addresses are mapped to each die.

When preparing to store data in system 20, host 24 assigns logical addresses to the data based on the reported information regarding the mapping, at an address assignment step 108. Typically, the host produces redundancy information for the data. Then, the host assigns the data and the redundancy information logical addresses, which will cause system 20 to store them in different dies. The host may implement any suitable redundancy scheme in system 20 using this technique, such as various RAID schemes.

Host 24 sends the data and redundancy information to controller 52 in system 20 for storage, at a sending step 112. Each data item (including data and redundancy information) is sent with the respective logical address that was assigned at step 108 above. System 20 stores the data and redundancy information in memory devices 28, at a storage step 116. In the storage process, system 20 translates the logical addresses to respective physical storage locations according to the mapping, and stores the data items at these physical storage locations. Since the logical addresses were assigned by the host based on the reported mapping, each data item will be stored in a different die. If a certain die fails, host 24 can recover the data item stored on that die based on (1) the data items that are stored on the remaining dies and (2) the redundancy information.
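The host-side address assignment of step 108 can be sketched as follows. The sketch assumes that the reported mapping information takes the form of one contiguous logical address range per die, and that the host tracks which offsets it has already used; the function name and both data structures are hypothetical.

```python
from typing import Dict, List, Tuple


def assign_addresses(
    die_ranges: Dict[int, range], items: List[bytes], next_free: Dict[int, int]
) -> List[Tuple[int, bytes]]:
    """Give each item a logical address that maps to a distinct die.

    `die_ranges` is the mapping information reported by the controller;
    `next_free` tracks the host's next unused offset within each die's range.
    """
    if len(items) > len(die_ranges):
        raise ValueError("more items than dies")
    assigned = []
    for die, item in zip(sorted(die_ranges), items):
        rng = die_ranges[die]
        offset = next_free.get(die, 0)
        if offset >= len(rng):
            raise RuntimeError(f"no free logical address left on die {die}")
        assigned.append((rng[offset], item))
        next_free[die] = offset + 1
    return assigned
```

The host would pass the N−1 data pages and the redundancy page as `items`, so that the N resulting logical addresses fall in N different dies.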

Using Obsolete Blocks for Temporary Backup

In some embodiments, system 20 carries out a garbage collection process, which copies valid data from partially-valid blocks (“source blocks”) and stores the data in a compact manner in other blocks (“destination blocks”). Once all valid data is copied from a given source block, the source block is regarded as obsolete and may be erased.

In some embodiments, system 20 uses the obsolete data that is stored in obsolete source blocks as temporary backup, and recovers the data from the obsolete blocks in the event of failure in the destination blocks. In some embodiments, system 20 implements this temporary backup scheme by (1) specifying that the source and destination blocks of any given data item will reside in different dies, and (2) delaying erasure of obsolete blocks as much as possible. These two guidelines increase the likelihood that, when a given die fails, the data of this die can still be recovered from obsolete blocks that formerly held the data.

FIG. 6 is a flow chart that schematically illustrates a method for garbage collection and recovery from die failure, in accordance with another embodiment of the present invention. The method begins with system 20 selecting a source block for garbage collection, at a source selection step 120. The selected source block resides in a certain die, denoted die X. System 20 then selects a destination block for copying the valid data from the source block, at a destination selection step 124. System 20 selects the destination block in a die denoted Y, which is different from die X. System 20 copies the valid data from the source block in die X to the destination block in die Y. System 20 then marks the source block as obsolete, i.e., ready for erasure.

In some embodiments, system 20 delays the actual erasure of the source block, at a delaying step 128. The delay in erasure extends the time period in which the data in the source block, although marked as invalid, is still available for readout if necessary. System 20 can delay erasure of the source block in various ways. For example, system 20 may hold the obsolete blocks without erasing them, until a new block is needed for programming. As another example, the system may hold a relatively small pool of blocks that are erased and ready for programming, and delay erasure of the remaining obsolete blocks.

At a certain point in time, system 20 identifies a failure in die Y (the die comprising the destination block), at a failure detection step 132. System 20 then checks whether an obsolete replica of the data of die Y is still available for recovery. In particular, for the destination block in question, system 20 checks whether the obsolete source block in die X (from which the data was copied to the destination block) has been erased, at an erasure checking step 136. If the obsolete source block was erased, the method terminates without recovering the data, at a failure termination step 140. If, on the other hand, the obsolete source block in die X was not erased yet, system 20 recovers the data in question by reading it from the source block, at a recovery step 144.

Typically, the scheme of FIG. 6 does not provide guaranteed backup, since the data is available for recovery only until the obsolete block is erased. On the other hand, this scheme provides some degree of backup without requiring additional memory or other system resources.
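The bookkeeping behind steps 132-144 can be sketched as follows. The class `ObsoleteBackup`, its string block identifiers, and the simplification that each destination block was filled from a single source block are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set


class ObsoleteBackup:
    """Tracks which obsolete source block last fed each destination block."""

    def __init__(self) -> None:
        self._source_of: Dict[str, str] = {}  # destination block -> source block
        self._erased: Set[str] = set()

    def record_copy(self, source: str, destination: str) -> None:
        """Note that valid data was copied from `source` to `destination`."""
        self._source_of[destination] = source

    def erase(self, source: str) -> None:
        """Note that an obsolete source block has actually been erased."""
        self._erased.add(source)

    def recovery_source(self, failed_destination: str) -> Optional[str]:
        """Return the un-erased source block to read from, or None if the
        data can no longer be recovered this way."""
        source = self._source_of.get(failed_destination)
        if source is None or source in self._erased:
            return None
        return source
```

A `None` result corresponds to the failure termination of step 140, which is why the scheme offers only best-effort backup.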

In the above embodiments, the source and destination blocks are selected in different dies. In alternative embodiments, however, the source and destination blocks may be selected in the same die, and the disclosed technique can be used for recovery from failure in the destination block.

Redundancy Schemes Using Logical Addressing

In some embodiments, system 20 implements a redundancy storage scheme based on the logical addresses (sometimes referred to as Logical Block Addresses, or LBAs) that are assigned to the data by the host. In an example embodiment, main controller 52 in system 20 assigns a dedicated LBA for each group of N−1 LBAs assigned by the host. The dedicated LBA (also referred to as “redundancy LBA”) is used for storing redundancy information that is computed over the data of the N−1 LBAs in the group (also referred to as “host LBAs”). Controller 52 typically assigns the redundancy LBAs in a range of LBAs that is not accessible to host 24.

Typically, system 20 ensures that each of the N LBAs in the group (the N−1 host LBAs and the redundancy LBA) is stored in a different die. For example, when defining the logical-physical address mapping, controller 52 may assign each die a certain respective range of LBAs, which does not overlap the ranges assigned to the other dies. Then, when selecting a group of N−1 LBAs for redundant storage, controller 52 selects each LBA from the range of a different die. Alternatively, controller 52 may ensure or verify that each LBA in the group maps to a different die using any other suitable method.

When using this sort of redundant storage for random write operations, any update of one of the host LBAs causes two programming operations: one programming operation of the updated host LBA and one programming operation of the redundancy LBA. Therefore, controller 52 attempts to avoid situations in which the redundancy LBAs are assigned in the same die or in a small group of dies. A situation of this sort would cause severe cycling and wear of the die or dies holding the redundancy LBAs. Thus, in some embodiments, controller 52 attempts to distribute the redundancy LBAs of different LBA groups across multiple dies. For sequential programming, the updating overhead is smaller. Therefore, this technique is particularly useful in applications that are dominated by random write operations, e.g., in transaction servers. Nevertheless, the technique is useful for sequential storage applications, as well.
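One simple way to distribute the redundancy LBAs across dies is a round-robin rotation over the LBA groups, sketched below. The rotation policy and both function names are assumptions chosen for illustration; the description above does not prescribe a particular distribution.

```python
from collections import Counter


def redundancy_die(group_index: int, num_dies: int) -> int:
    """Die that holds the redundancy LBA of LBA group `group_index`.

    Round-robin rotation is an assumed policy, used here only as an example.
    """
    if num_dies < 2:
        raise ValueError("need at least two dies")
    return group_index % num_dies


def redundancy_load(num_groups: int, num_dies: int) -> Counter:
    """Count how many redundancy LBAs land on each die."""
    return Counter(redundancy_die(g, num_dies) for g in range(num_groups))
```

With this rotation, the redundancy-related programming operations are spread evenly whenever the number of groups is a multiple of the number of dies.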

Additionally or alternatively, system 20 may store the redundancy LBAs using a storage configuration that is more resilient to severe cycling, in comparison with the storage configuration used for storing the host LBAs. For example, system 20 may partition each die into two subsets of memory cells. One subset, referred to as a high-endurance area, stores data at a lower density (lower number of bits/cell, smaller number of programming levels), but on the other hand at a higher endurance. The other subset, referred to as a high-density area, stores data at a higher density, but at a lower resilience to cycling. System 20 stores the host LBAs in the high-density area, and the redundancy LBAs in the high-endurance area.

Because of the lower storage density in the high-endurance area, the memory cells in this area can tolerate considerably more cycling, i.e., higher numbers of programming and erasure cycles, relative to the memory cells in the high-density area. As such, it is preferable to store the redundancy LBAs in the high-endurance area. This technique, however, causes some degradation in capacity due to the lower storage density in the high-endurance area.

System 20 can define the high-endurance and high-density storage configurations in any desired manner. For example, the system may store the host LBAs in the high-density area using N programming levels (programming states) per cell, and store the redundancy LBAs in the high-endurance area using M programming levels, M<N. In an example embodiment, system 20 stores the host LBAs using a Multi-Level Cell (MLC) configuration having four levels or eight levels, and the redundancy LBAs using a Single-Level Cell (SLC) configuration having two programming levels. As another example, the host LBAs may be stored using three programming levels, and the redundancy LBAs using two programming levels (SLC). Alternatively, any other suitable values of N and M can be used.

FIG. 7 is a flow chart that schematically illustrates a method for redundant data storage using logical address redundancy, in accordance with another embodiment of the present invention. The method begins with main controller 52 defining a mapping between logical addresses (LBAs) and physical storage locations, at a mapping specification step 150. Controller 52 selects N LBAs that are mapped to N respective physical storage locations in N different dies, at a group selection step 154. One of the N LBAs (the redundancy LBA) is selected from a range of LBAs that is not accessible to the host, and is designated for storing redundancy information. The other N−1 are referred to as host LBAs. The group of N LBAs is referred to as a logical stripe.

Controller 52 stores incoming data from the host in the host LBAs, and redundancy information in the redundancy LBA. Consider, for example, new data that arrives from the host and is addressed to one of the host LBAs in the group. Upon receiving this data, controller 52 updates the redundancy information for this group to reflect the new data. Controller 52 then stores the data in the host LBA, at a first storage step 158, and the updated redundancy information in the redundancy LBA, at a second storage step 162. In some embodiments, controller 52 stores the host LBA at high density (e.g., in a high-density area assigned in the respective die), and the redundancy LBA at a high-endurance configuration (e.g., in a high-endurance area assigned in the respective die).
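The update flow of steps 158 and 162 can be sketched as follows, assuming bitwise XOR as the redundancy function (the description does not mandate a particular code for this scheme). The class `LogicalStripe` and its methods are hypothetical names used only for this illustration, and the lists stand in for the physical storage locations in the N dies.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

class LogicalStripe:
    """Hypothetical model of one logical stripe: N-1 host LBAs plus one
    redundancy LBA, each assumed to reside in a different die."""

    def __init__(self, n: int, page_size: int):
        self.host = [bytes(page_size) for _ in range(n - 1)]
        self.redundancy = bytes(page_size)  # XOR of all-zero pages is zero

    def write(self, index: int, new_data: bytes) -> int:
        """Update one host LBA (new_data must be page_size bytes long).

        Returns the number of programming operations performed.
        """
        old = self.host[index]
        # Updated redundancy = old redundancy XOR old data XOR new data.
        self.redundancy = xor_bytes(self.redundancy, xor_bytes(old, new_data))
        self.host[index] = new_data        # first storage step (158)
        return 2                           # host LBA + redundancy LBA (162)

    def recover(self, lost_index: int) -> bytes:
        """Rebuild one host LBA from the other host LBAs and the redundancy."""
        survivors = [p for i, p in enumerate(self.host) if i != lost_index]
        return reduce(xor_bytes, survivors, self.redundancy)

stripe = LogicalStripe(n=4, page_size=4)
ops = stripe.write(0, b"\x01\x02\x03\x04")
stripe.write(2, b"\xaa\xbb\xcc\xdd")
```

In this sketch every host write costs two programming operations, which is the overhead that motivates placing the redundancy LBA in a high-endurance area.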

The description of FIG. 7 refers to a single logical stripe, i.e., a single group of N LBAs. Generally, however, system 20 defines multiple logical stripes. In an example embodiment, the entire range of LBAs is divided into logical stripes, i.e., groups of N LBAs. Data storage within each logical stripe is carried out using the method of FIG. 7.

Recovery from Die Failure Using ECC

In some embodiments, system 20 protects the stored data from die failures by encoding the data with an ECC, and distributing the bits of each ECC code word over multiple dies. Consider, for example, a group of thirty-three dies. In an example embodiment, system 20 encodes incoming data, to produce ECC code words. Each code word comprises multiple bits, e.g., several hundred or several thousand bits. System 20 distributes (interleaves) the bits of each code word across the thirty-three dies. If any single die fails, approximately 3% of the bits in each code word will be lost. For a properly chosen ECC, the entire data can be recovered from the code words in the absence of the 3% lost bits.

In some embodiments, the lost bits can be pointed out to the ECC decoder as erasures, since the identity of the failed die is often known. Some ECC types, such as Reed-Solomon (RS) or Low-Density Parity Check (LDPC) codes, have a considerably higher correction capability when provided with erasure information. This technique is useful not only for die failures that cause total data loss, but also for failures that cause high bit error rate on one or more dies. In alternative embodiments, the bits are associated with respective soft metrics (e.g., Log Likelihood Ratios, LLRs), and the ECC decoder decodes the code word using the soft metrics.

In some embodiments, system 20 may apply any of the redundant storage techniques disclosed herein (e.g., any of the disclosed logical and/or physical addressing schemes, and/or any other disclosed techniques for using spare memory units) in storing the subsets of bits of the ECC code word in the multiple memory units.

Spare Dies for Replacing Failed Dies

In some embodiments, system 20 comprises one or more spare dies that are reserved for replacing failed dies. In some embodiments, a spare die is not used for storing data until it replaces a failed die. In alternative embodiments, the spare dies provide additional physical storage space, which can be used for storing data at improved performance. In the event of die failure, less physical space is available, but the available physical space still meets the system specification. Any of the redundant storage techniques described herein can be used in conjunction with spare dies.

Consider, for example, a system that normally performs redundant storage in N dies. In such a system, an (N+1)th die can be added. The (N+1)th die is not used as long as the other N dies are functional. If one of the N dies fails, the data stored in the failed die can be recovered based on the redundancy information, and copied to the spare die. From that point, the system can continue to operate with N dies as before. In some embodiments, the data of the failed die can be restored to the spare die gradually. For example, whenever certain data is accessed, the data is restored and then programmed to the spare die. Alternatively, the entire content of the failed die can be restored upon detecting that the die has failed. In some embodiments, when memory devices 28 comprise Dual-Die Packages (DDP), the system inherently comprises a spare die.
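The two restoration policies (gradual, on access, versus restoring the entire die at once) can be illustrated by the following Python sketch. It assumes XOR redundancy over pages having the same index in each die; the class `SpareRebuild` and the in-memory lists that stand in for dies are hypothetical.

```python
from functools import reduce

def xor_pages(pages):
    """XOR a non-empty list of equal-length pages."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), pages)

class SpareRebuild:
    """Hypothetical restoration of a failed die onto a spare die."""

    def __init__(self, surviving_dies, num_pages):
        # surviving_dies: the N-1 functional dies, including the one that
        # holds the redundancy information; each die is a list of pages.
        self.surviving = surviving_dies
        self.spare = [None] * num_pages    # None = not yet restored

    def read(self, page_index):
        """Gradual policy: restore a page the first time it is accessed."""
        if self.spare[page_index] is None:
            self.spare[page_index] = xor_pages(
                [die[page_index] for die in self.surviving])
        return self.spare[page_index]

    def rebuild_all(self):
        """Immediate policy: restore the entire content of the failed die."""
        for i in range(len(self.spare)):
            self.read(i)

die0 = [b"\x01", b"\x02"]
die1 = [b"\x04", b"\x08"]                  # this die is assumed to fail
parity = [xor_pages([a, b]) for a, b in zip(die0, die1)]
rebuild = SpareRebuild([die0, parity], num_pages=2)
first = rebuild.read(1)                    # restores only page 1
restored_early = list(rebuild.spare)
rebuild.rebuild_all()
```

The gradual policy spreads the reconstruction cost over normal read traffic, at the price of a longer period during which the system has no remaining redundancy for the pages not yet restored.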

In alternative embodiments, the spare dies are used during normal operation, and therefore increase the ratio between physical storage space and the logical address range accessible by the host. During normal operation, the additional physical storage space can be used for various purposes. For example, the additional physical space can be used to increase the Over-Provisioning ratio of system 20, i.e., the average number of invalid data regions (“holes”) per block. Higher over-provisioning overhead increases the efficiency of the garbage collection process, and therefore improves programming throughput. When a certain die fails and is replaced by a spare die, the over-provisioning overhead decreases due to the lower available physical storage space.

Hierarchical Calculation of Redundancy Information

In the different redundant storage schemes described above, system 20 computes redundancy information for data items that are stored on multiple dies. In some embodiments, system 20 can distribute this computation over multiple processors, such that each processor computes partial redundancy information for the dies it is associated with. In these embodiments, it is assumed that the redundancy calculation can be partitioned into several independent sub-calculations whose results can be subsequently combined. Exclusive OR (XOR) redundancy, for example, meets this condition.

Consider, for example, the configuration of FIG. 1 above, and consider a configuration in which the redundancy information is computed over dies that are located in all memory devices 28. In some embodiments, the redundancy information can be computed in a centralized manner by main controller 52. Alternatively, however, each MSP 40 can calculate partial redundancy information over the dies it is responsible for. Then, main controller 52 can combine the partial redundancy information from the multiple MSPs in order to derive the total redundancy information pertaining to all the dies.

As another example, each MSP 40 calculates partial redundancy information over its respective dies, and each channel controller combines the partial redundancy information from its respective MSPs to produce a channel-level result. The main controller combines the channel-level results of the channel controllers to produce the total redundancy information. Alternatively, system 20 can partition the redundancy computation in any other suitable manner. This technique reduces the computational burden on the main controller and channel controllers, and also increases storage speed.
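The three-level combination (MSP, channel controller, main controller) relies on XOR being associative and commutative, so partial results can be combined in any grouping. The following Python sketch illustrates this; the nested-list layout and the function names are assumptions made for the example, and the loop runs sequentially although in practice each level would run on its own processor.

```python
from functools import reduce

def xor_pages(pages):
    """XOR a non-empty list of equal-length pages."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), pages)

def hierarchical_redundancy(channels):
    """channels: list of channels; each channel is a list of MSPs;
    each MSP is a list of the pages stored in its dies."""
    channel_results = []
    for msps in channels:
        # Each MSP computes partial redundancy over its own dies.
        partials = [xor_pages(die_pages) for die_pages in msps]
        # The channel controller combines the partial results of its MSPs.
        channel_results.append(xor_pages(partials))
    # The main controller combines the channel-level results.
    return xor_pages(channel_results)

channels = [
    [[b"\x01", b"\x02"], [b"\x04"]],       # channel 0: two MSPs
    [[b"\x08", b"\x10"]],                  # channel 1: one MSP
]
total = hierarchical_redundancy(channels)
flat = xor_pages([p for msps in channels for dies in msps for p in dies])
```

The hierarchical result equals the result of a single centralized XOR over all the pages, which is the condition stated above for partitioning the calculation.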

Fast Readout of Redundantly-Stored Data

As described above, in some embodiments the memory controller stores data in N memory units (e.g., dies), and redundancy information for the data in an (N+1)th memory unit. In order to retrieve the data, the memory controller issues read commands to the memory units. In some practical cases, however, some memory units may be slow in responding to the read commands. For example, a given memory unit may be busy with another task (e.g., erasure, read threshold estimation, programming or readout) and will therefore respond after a relatively long delay. In some embodiments, the memory controller reduces the sensitivity of the readout process to delays by reconstructing the data without waiting for the slowest memory unit.

In an example embodiment, the memory controller issues a respective read command to each of the N+1 memory units, including both the N memory units holding the data and the (N+1)th memory unit holding the redundancy information. As soon as the first (fastest) N memory units respond, the memory controller reconstructs the data from the N responses without waiting for the (N+1)th response. In some cases, the first N memory units to respond are the ones holding the data (i.e., the memory unit holding the redundancy information is the slowest), in which case the memory controller has all the data ready. In other cases, the first N memory units to respond are N−1 of the memory units holding the data, and the memory unit holding the redundancy information. In this case, the memory controller reconstructs the data of the remaining memory unit, which has not yet responded, based on the data read from the other N−1 memory units and the redundancy information read from the (N+1)th memory unit.

FIG. 8 is a flow chart that schematically illustrates a method for redundant data readout, in accordance with an embodiment of the present invention. The method begins with the memory controller storing N data units (N data pages in the present example) in N respective dies, and a redundancy data unit (a redundancy page in the present example) holding redundancy information for the data in an (N+1)th die, at a redundant storage step 170. The memory controller may use any suitable redundancy scheme, such as any of the schemes described above.

At a later time, the memory controller is requested to retrieve the stored data. In order to retrieve the data, the memory controller issues N+1 read commands to the N+1 dies in question, at a read instruction step 174. The memory controller waits until the first (fastest) N pages arrive, at an arrival checking step 178. As soon as the first N pages arrive, the memory controller reconstructs the data based on these N pages, at a data reconstruction step 182, irrespective of the (N+1)th page. If the first N pages are the data pages, the data is already available. If the first N pages are N−1 data pages and the redundancy page, the memory controller reconstructs the remaining data page, which has not yet arrived, based on the N−1 data pages and the redundancy page. The memory controller then outputs the reconstructed data. Using this technique, a die (or other memory unit) that is slow to respond to readout commands does not delay the entire readout process.
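Steps 178 and 182 can be sketched in Python as follows, assuming XOR redundancy. To keep the example deterministic, the arrival order of the pages is given as an input sequence rather than produced by real asynchronous reads; the function name and the convention that unit index N holds the redundancy page are assumptions made for the illustration.

```python
def reconstruct_from_first_n(arrivals, n):
    """Rebuild N data pages from the first N pages to arrive.

    arrivals: iterable of (unit_index, page) in arrival order. Units
    0..n-1 hold data pages; unit n holds the XOR redundancy page.
    """
    received = {}
    for unit, page in arrivals:
        received[unit] = page
        if len(received) == n:             # arrival checking step
            break                          # do not wait for the last page
    if len(received) < n:
        raise ValueError("fewer than N pages arrived")
    missing = [u for u in range(n) if u not in received]
    if missing:
        # N-1 data pages plus the redundancy page arrived: XOR them all
        # together to obtain the one data page that is still outstanding.
        acc = received[n]
        for u in range(n):
            if u in received:
                acc = bytes(x ^ y for x, y in zip(acc, received[u]))
        received[missing[0]] = acc
    return [received[u] for u in range(n)]

data = [b"ab", b"cd", b"ef"]
parity = bytes(a ^ b ^ c for a, b, c in zip(*data))
# Unit 1 is slow: the redundancy page and two data pages arrive first.
slow_case = reconstruct_from_first_n(
    [(3, parity), (0, data[0]), (2, data[2]), (1, data[1])], n=3)
# The redundancy unit is slowest: all data pages arrive first.
fast_case = reconstruct_from_first_n(
    [(1, data[1]), (0, data[0]), (2, data[2]), (3, parity)], n=3)
```

In both cases the function returns after consuming exactly N arrivals, so the readout latency is set by the Nth fastest unit rather than by the slowest one.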

In some embodiments, system 20 supports an interface using which the memory units report their expected response latencies to the memory controller. In an example embodiment, the memory controller queries each memory unit before retrieving the data, and in return each memory unit reports its expected response latency. The memory controller selects the N memory units having the lowest expected latencies (from among the N+1 memory units in question), and issues read commands only to the N selected memory units. If two or more memory units report the same expected latency, the memory controller may choose between them according to another criterion, e.g., the memory units having the smallest number of pending read requests. Alternatively, any other suitable criterion can be used.
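The selection rule can be sketched in Python as a sort on a two-part key: expected latency first, number of pending read requests as the tie-breaker. The report format (a tuple of unit identifier, expected latency and pending reads) and the function name are hypothetical; the description does not define the reporting interface.

```python
def select_units(reports, n):
    """Pick the N units to read from.

    reports: list of (unit_id, expected_latency, pending_reads) tuples,
    one per memory unit among the N+1 units in question.
    """
    # Lowest expected latency first; ties go to fewer pending reads.
    ranked = sorted(reports, key=lambda r: (r[1], r[2]))
    return [unit_id for unit_id, _, _ in ranked[:n]]

reports = [
    (0, 50, 1),
    (1, 200, 0),   # busy unit, e.g., in the middle of an erasure
    (2, 50, 0),
    (3, 80, 2),
]
selected = select_units(reports, n=3)
```

Here units 0 and 2 report the same latency, so unit 2 is ranked first because it has fewer pending reads, and the busy unit 1 receives no read command.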

In some embodiments, the slow response time of a given die may be caused by various background tasks (e.g., garbage collection and/or housekeeping tasks) that are performed by the memory controller on the memory units. In some embodiments, the memory controller schedules background tasks in the different memory units (e.g., dies), so as to enable fast readout. For example, the memory controller may schedule background tasks in the N+1 memory units using “round robin” scheduling, such that at any given time there are N memory units that are not busy executing background tasks.

In some cases, the (N+1)th memory unit was slow to respond because the read thresholds that were used for reading the memory cells were not positioned properly. In some embodiments, after successfully reconstructing the data based on the other N readout results, the memory controller provides the reconstructed data of the (N+1)th memory unit to the (N+1)th memory unit. This data can be used by the (N+1)th memory unit, for example, to find the correct locations of the read thresholds. In alternative embodiments, the reconstructed data provided to the (N+1)th memory unit by the memory controller can be used by the (N+1)th memory unit to improve, speed-up or generally enable subsequent readout in any other suitable way.

In the above embodiments, the memory controller stores data in N memory units and redundancy information in an (N+1)th memory unit, and recovers the data using the N memory units that are fastest to respond to read requests. In alternative embodiments, the memory controller may store data in a total of N+K memory units, such that N memory units hold data and the remaining K memory units hold redundancy information. The memory controller then may recover the data using any partial subset of the memory units that are fastest to respond.

In the above embodiments, the memory controller issues read requests to all N+K memory units. In some embodiments, however, the memory controller may issue fewer than N+K read commands. For example, the memory controller may have information indicating that a given memory unit is busy with another task. In this scenario, the memory controller may refrain a priori from issuing a read command to this memory unit, and prefer to recover the data from the remaining memory units.

Fast Redundant Programming of Burst Data

In some embodiments, the memory controller applies a programming scheme that stores incoming data efficiently in RAID stripes, and performs redundancy information updates only during block compaction (“garbage collection”). In an example embodiment, the memory controller stores data in groups of N logical pages that are referred to as RAID stripes. Each RAID stripe comprises a certain amount of incoming user data (e.g., N−K pages) and a certain amount (e.g., K pages) of redundancy information computed over the user data. The memory controller maintains a database of pointers that point to the physical storage addresses of the N pages in each RAID stripe, which are typically distributed over N memory units.

As user data arrives, the memory controller arranges the data, with redundancy information, in RAID stripes and stores the RAID stripes in the memory units. Since all N pages of a given RAID stripe are stored at the same time, there is no need for updating of redundancy information during programming. As a result, the programming process is fast and efficient.

During the background compaction process, the memory controller selects for compaction memory blocks that contain some invalid pages. Each invalid page belongs to a certain RAID stripe. The memory controller applies compaction by reading one or more invalid pages, reading the redundancy information of the RAID stripe(s) to which these invalid pages belong, replacing the data of the invalid pages with new data, updating the redundancy information based on the invalid pages and the new data, and writing the new data and the updated redundancy information to new physical storage locations. Finally, the memory controller updates the pointer database with the new physical storage locations of the new data and the updated redundancy information.
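The compaction step can be sketched in Python as follows, assuming a single XOR redundancy page per RAID stripe (K=1). The dictionary layout of a stripe, the pointer database format (a list of N physical addresses per stripe, the last one for the redundancy page) and the function name are hypothetical choices made for the illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

def replace_invalid_page(stripe, slot, new_data, pointer_db, stripe_id,
                         new_addrs):
    """Replace one invalid page of a stripe during compaction."""
    invalid = stripe["pages"][slot]        # read the invalid page
    redundancy = stripe["redundancy"]      # read the stripe's redundancy
    # Cancel the invalid page out of the redundancy and add the new data.
    stripe["redundancy"] = xor_bytes(redundancy,
                                     xor_bytes(invalid, new_data))
    stripe["pages"][slot] = new_data
    # Both pages go to new physical locations; record them.
    data_addr, redundancy_addr = new_addrs
    pointer_db[stripe_id][slot] = data_addr
    pointer_db[stripe_id][-1] = redundancy_addr

stripe = {"pages": [b"\x01", b"\x02", b"\x04"], "redundancy": b"\x07"}
pointer_db = {0: [10, 20, 30, 40]}         # N = 4 addresses for stripe 0
replace_invalid_page(stripe, slot=1, new_data=b"\x10",
                     pointer_db=pointer_db, stripe_id=0, new_addrs=(21, 41))
```

Only the invalid page and the redundancy page are read in this sketch; the other pages of the stripe stay where they are, and their pointers are unchanged.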

This scheme is advantageous, for example, in applications where programming commands arrive in short bursts. In such applications, the programming throughput is not degraded by redundancy information updates. All redundancy updates are performed during the background compaction process.

The embodiments described herein refer mainly to redundant storage schemes in which resilience to failure is achieved by using memory units that reside on different dies or memory devices. In alternative embodiments, however, the memory units that are referred to in the disclosed schemes may comprise logical memory units, which do not necessarily reside on different dies or devices. For example, a given memory die may be partitioned into two or more logical memory units, each logical memory unit comprising a respective subset of the address space of that die. Any of the methods and systems described herein can be carried out using such logical memory units. Hybrid solutions, in which memory units are distributed on different dies but each die may comprise more than one memory unit, are also possible.

It will be appreciated that the embodiments described above are cited by way of example, and that the present invention is not limited to what has been particularly shown and described hereinabove. Rather, the scope of the present invention includes both combinations and sub-combinations of the various features described hereinabove, as well as variations and modifications thereof which would occur to persons skilled in the art upon reading the foregoing description and which are not disclosed in the prior art.

The invention claimed is:
 1. A method for data storage, comprising: in a memory that includes at least N memory units, each memory unit comprising memory blocks, defining superblocks, each superblock comprising a respective set of N of the memory blocks that are allocated respectively in N different ones of the memory units, such that compaction of all the memory blocks in a given superblock is performed without any intervening programming operation in the given superblock; storing data in the memory by computing redundancy information for a selected portion of the data, and storing the selected portion and the redundancy information in the N memory blocks of a selected superblock; and reading data from a given superblock by issuing read commands to the set of N memory blocks within a given superblock without waiting for data to arrive from a given memory block before issuing a subsequent read command to a next memory block, wherein reading the data from a given superblock further comprises: receiving data from N−1 memory blocks; and using the redundancy information to reconstruct the data unless the redundancy information is in the Nth memory block to arrive.
 2. The method according to claim 1, wherein erasure of all the memory blocks in the given superblock is performed without any intervening programming operation in the given superblock.
 3. The method according to claim 1, wherein storing the data comprises storing the selected portion in K memory blocks of the selected superblock, 1≦K<N, and storing the redundancy information in N−K remaining memory blocks of the selected superblock.
 4. The method according to claim 1, wherein storing the data comprises programming the portions of the data and the redundancy information into N pages belonging respectively to the memory blocks of the selected superblock, without any intervening programming operation.
 5. The method according to claim 1, and comprising, upon a failure in a given memory unit from among the N memory units, recovering the data using at least some of the data and the redundancy information that is stored in the memory units other than the given memory unit.
 6. The method according to claim 1, and comprising compacting the data stored in a source superblock of the memory by copying valid data from partially-valid memory blocks in the source superblock to a destination superblock and subsequently erasing the source superblock.
 7. The method according to claim 6, wherein copying the valid data comprises moving the valid data from first blocks of the source superblock to respective second blocks in the destination superblock, such that each second block belongs to the same memory unit as the respective first block.
 8. The method according to claim 7, and comprising running a background process that copies parts of the data between different ones of the memory units.
 9. The method according to claim 6, wherein compacting the data comprises, upon detecting that the source superblock no longer contains any valid data but the destination superblock is not full, selecting an additional source superblock and copying additional valid data from the additional source superblock to the destination superblock.
 10. The method according to claim 6, wherein compacting the data comprises selecting a given superblock for compaction based on a distribution of invalid data in the superblocks.
 11. The method according to claim 10, wherein selecting the given superblock comprises choosing the given superblock having a highest amount of the invalid data among the superblocks to serve as the source superblock.
 12. The method according to claim 1, wherein defining the superblock comprises classifying input data to rarely-updated data and frequently-updated data, assigning first superblocks for storing the rarely-updated data and assigning second superblocks, different from the first superblocks, for storing the frequently-updated data.
 13. The method according to claim 1, and comprising reserving at least one spare memory unit in addition to the N memory units, and replacing a failed memory unit from among the N memory units with the spare memory unit.
 14. The method according to claim 13, and comprising temporarily using the spare memory unit for improving performance of data storage in the N memory units.
 15. The method according to claim 1, wherein the memory units are partitioned into multiple groups that are associated with respective multiple processors, and wherein storing the data comprises distributing computing of the redundancy information among the multiple processors.
 16. The method according to claim 1, and comprising, in response to a failure in a memory block of a given superblock, redefining the given superblock to include only the blocks other than the memory block having the failure, and storing subsequent data in the redefined superblock.
 17. A method for data storage, comprising: in a memory controller that accepts data items from a host for storage in multiple memory units, defining a mapping between logical addresses assigned by the host and respective physical storage locations in the memory units; reporting information indicative of the mapping from the memory controller to the host; in the host, computing redundancy information for the data items, and assigning respective first logical addresses to the data items and second logical addresses to the redundancy information responsively to the reported information indicative of the mapping; and storing the data items and the redundancy information by the memory controller in the physical storage locations that correspond to the respective first and second logical addresses assigned by the host; and reserving a subset of the memory units for use as one or more spare memory units to replace one or more memory units in response to determining a memory unit has failed and using the one or more spare memory units to improve data storage performance in the memory units until the one or more spare memory units is used as a replacement memory unit.
 18. The method according to claim 17, wherein assigning the first and second logical addresses in the host comprises causing the data items and the redundancy information to be stored in different ones of the memory units.
 19. The method according to claim 17, wherein reporting the information comprises reporting respective ranges of the logical addresses that are mapped to the memory units according to the mapping.
 20. The method according to claim 17, wherein reporting the information comprises, responsively to modifying the mapping in the memory controller, reporting to the host updated information that is indicative of the modified mapping.
 21. A data storage apparatus, comprising: an interface, which is configured to communicate with a memory that includes at least N memory units, each memory unit including memory blocks; and a processor, which is configured to: define superblocks, each superblock comprising a respective set of N of the memory blocks that are allocated respectively in N different ones of the memory units, such that compaction of all the memory blocks in a given superblock is performed without any intervening programming operation in the given superblock; store data in the memory by computing redundancy information for a selected portion of the data, and storing the selected portion and the redundancy information in the N memory blocks of a selected superblock; and read data from a given superblock by issuing read commands to the set of N memory blocks within a given superblock without waiting for data to arrive from a given memory block before issuing a subsequent read command to a next memory block, wherein reading the data from a given superblock further comprises: receiving data from N−1 memory blocks; and using the redundancy information to reconstruct the data unless the redundancy information is in the Nth memory block to arrive.
 22. The apparatus according to claim 21, wherein the processor is configured to erase all the memory blocks in the given superblock without any intervening programming operation in the given superblock.
 23. The apparatus according to claim 21, wherein the processor is configured to store the selected portion in K memory blocks of the selected superblock, 1≦K<N, and to store the redundancy information in N−K remaining memory blocks of the selected superblock.
 24. The apparatus according to claim 21, wherein the processor is configured to program the portions of the data and the redundancy information into N pages belonging respectively to the memory blocks of the selected superblock, without any intervening programming operation.
 25. The apparatus according to claim 21, wherein, upon a failure in a given memory unit from among the N memory units, the processor is configured to recover the data using at least some of the data and the redundancy information that is stored in the memory units other than the given memory unit.
 26. The apparatus according to claim 21, wherein the processor is configured to compact the data stored in a source superblock of the memory by copying valid data from partially-valid memory blocks in the source superblock to a destination superblock and subsequently erasing the source superblock.
 27. The apparatus according to claim 26, wherein the processor is configured to copy the valid data from first blocks of the source superblock to respective second blocks in the destination superblock, such that each second block belongs to the same memory unit as the respective first block.
 28. The apparatus according to claim 27, wherein the processor is configured to run a background process that copies parts of the data between different ones of the memory units.
 29. The apparatus according to claim 26, wherein, upon detecting that the source superblock no longer contains any valid data but the destination superblock is not full, the processor is configured to select an additional source superblock and to copy additional valid data from the additional source superblock to the destination superblock.
 30. The apparatus according to claim 26, wherein the processor is configured to select a given superblock for compaction based on a distribution of invalid data in the superblocks.
 31. The apparatus according to claim 30, wherein the processor is configured to choose the given superblock having a highest amount of the invalid data among the superblocks to serve as the source superblock.
 32. The apparatus according to claim 21, wherein the processor is configured to classify input data to rarely-updated data and frequently-updated data, to assign first superblocks for storing the rarely-updated data and to assign second superblocks, different from the first superblocks, for storing the frequently-updated data.
 33. The apparatus according to claim 21, wherein the processor is configured to reserve at least one spare memory unit in addition to the N memory units, and to replace a failed memory unit from among the N memory units with the spare memory unit.
 34. The apparatus according to claim 33, wherein the processor is configured to temporarily use the spare memory unit for improving performance of data storage in the N memory units.
 35. The apparatus according to claim 21, wherein the memory units are partitioned into multiple groups, and wherein the processor comprises multiple processors that compute the redundancy information in the respective groups.
 36. The apparatus according to claim 21, wherein the processor is configured, in response to a failure in a memory block of a given superblock, to redefine the given superblock to include only the blocks other than the memory block having the failure, and to store subsequent data in the redefined superblock.
 37. A memory controller, comprising: an interface for communicating with a host; and a processor, which is configured to: accept from the host via the interface data items for storage in multiple memory units; define a mapping between logical addresses assigned by the host and respective physical storage locations in the memory units; report information indicative of the mapping from the memory controller to the host, so as to cause the host to compute redundancy information for the data items and assign respective first logical addresses to the data items and second logical addresses to the redundancy information responsively to the reported information indicative of the mapping; store the data items and the redundancy information accepted from the host in the physical storage locations that correspond to the respective first and second logical addresses assigned by the host; and reserve a subset of the memory units for use as one or more spare memory units to replace one or more memory units in response to determining a memory unit has failed and using the one or more spare memory units to improve data storage performance in the memory units until the one or more spare memory units are used as replacement memory units.
 38. The apparatus according to claim 37, wherein the first and second logical addresses assigned by the host cause the data items and the redundancy information to be stored in different ones of the memory units.
 39. The apparatus according to claim 37, wherein the processor is configured to report respective ranges of the logical addresses that are mapped to the memory units according to the mapping.
 40. The apparatus according to claim 37, wherein the processor is configured to modify the mapping, and to report to the host updated information that is indicative of the modified mapping.
 41. A method for data storage, comprising: in a memory that includes at least N memory units that are partitioned into memory blocks, grouping input data into sets of N logical pages, each set containing a respective portion of the data and redundancy information that is computed over the portion; storing each set in the memory such that the N logical pages in the set are stored in N different memory units; reading data from a given superblock by issuing read commands to the set of N memory blocks within a given superblock without waiting for data to arrive from a given memory block before issuing a subsequent read command to a next memory block, wherein reading the data from a given superblock further comprises: receiving data from N−1 memory blocks; and using the redundancy information to reconstruct the data unless the redundancy information is in the Nth memory block to arrive; and compacting a selected memory block by reading at least one invalid page from the selected memory block, reading the redundancy information of the respective set to which the invalid page belongs, updating the read redundancy information based on the read invalid page and on new data, and storing the updated redundancy information and the new data in the memory.
 42. A data storage apparatus, comprising: an interface, which is configured to communicate with a memory that includes at least N memory units that are partitioned into memory blocks; and a processor, which is configured to: group input data into sets of N logical pages, each set containing a respective portion of the data and redundancy information that is computed over the portion; store each set in the memory such that the N logical pages in the set are stored in N different memory units; read data from a given superblock by issuing read commands to the set of N memory blocks within a given superblock without waiting for data to arrive from a given memory block before issuing a subsequent read command to a next memory block, wherein reading the data from a given superblock further comprises: receiving data from N−1 memory blocks; and using the redundancy information to reconstruct the data unless the redundancy information is in the Nth memory block to arrive; and compact a selected memory block by reading at least one invalid page from the selected memory block, reading the redundancy information of the respective set to which the invalid page belongs, updating the read redundancy information based on the read invalid page and on new data, and storing the updated redundancy information and the new data in the memory.
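By way of illustration only, the read and compaction steps recited in claims 41 and 42 may be sketched as follows. The sketch assumes that the redundancy information is a bitwise XOR (parity) over the N−1 data pages of a set, which is one possible choice and is not required by the claims. The function names (xor_pages, make_set, read_set, update_parity) and the convention that the parity page occupies index N−1 are hypothetical and are introduced only for this example.

```python
from functools import reduce


def xor_pages(*pages: bytes) -> bytes:
    """Bytewise XOR of one or more equal-length pages."""
    if not pages:
        raise ValueError("at least one page is required")
    if any(len(p) != len(pages[0]) for p in pages):
        raise ValueError("all pages must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


def make_set(data_pages):
    """Group N-1 data pages with their parity page into a set of N logical pages.

    Assumption for this sketch: the parity page is stored last (index N-1).
    """
    return list(data_pages) + [xor_pages(*data_pages)]


def read_set(arrivals, n):
    """Return the N-1 data pages given the first N-1 pages to arrive.

    `arrivals` is a list of (index, page) pairs in arrival order. If the page
    still outstanding is the parity page, no reconstruction is needed;
    otherwise the missing data page is the XOR of everything that arrived.
    """
    got = dict(arrivals)
    if len(got) != n - 1:
        raise ValueError("expected exactly N-1 distinct pages")
    missing = (set(range(n)) - set(got)).pop()
    if missing != n - 1:  # a data page is still outstanding: rebuild it
        got[missing] = xor_pages(*got.values())
    return [got[i] for i in range(n - 1)]


def update_parity(old_parity: bytes, invalid_page: bytes, new_page: bytes) -> bytes:
    """Compaction step: replace an invalid page's contribution with new data."""
    return xor_pages(old_parity, invalid_page, new_page)
```

In this sketch, read_set returns as soon as any N−1 of the N pages are available, so the slowest memory unit does not delay the read, and update_parity refreshes the redundancy information without re-reading the remaining valid pages of the set.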